1 Preparations

1.1 Packages
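The packages used throughout these notes (a sketch inferred from the functions called below; note that dmetar is installed from GitHub, not CRAN):

```r
library(meta)    # metagen(), forest(), drapery(), metareg(), metainf()
library(metafor) # rma(), gosh()
library(dmetar)  # ThirdWave data, find.outliers(), InfluenceAnalysis(),
                 # gosh.diagnostics(), power.analysis()
```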

1.2 Environment

2 Between-Study Heterogeneity

2.1 Cochran’s Q

  • \(Q\) increases both when the number of studies \(K\) increases and when the precision (i.e. the sample size) of the studies increases. Therefore, \(Q\) and its significance depend heavily on the size of the meta-analysis, and thus on its statistical power.
  • It follows that we should not rely solely on the significance of a \(Q\)-test when assessing heterogeneity.
  • \(\hat{\theta}\): the pooled effect according to the fixed-effect model.

\[ Q = \sum^{K}_{k=1} \; w_k \; (\hat{\theta}_k - \hat{\theta})^2 \\ w_k = \frac{1}{s_k^2} \]

Case 1: no heterogeneity. Here \(\zeta_k = 0\), and the residuals \(\hat{\theta}_k - \hat{\theta}\) are solely a product of the sampling error \(\epsilon_k\). \[ \hat{\theta}_k - \hat{\theta} \sim N(0, \; 1) \]

set.seed(2024)
error_fixed <- replicate(n = 10000, rnorm(40))

Case 2: heterogeneity. Here the residuals are the sum of the sampling error and the true effect size heterogeneity (reference: https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/pooling-es.html#rem). \[ \hat{\theta}_k - \hat{\theta} = \epsilon_k + \zeta_k \]

set.seed(2024)
error_random <- replicate(n = 10000, rnorm(40) + rnorm(40))

Simplify the formula of \(Q\) a little by assuming that the variance, and thus the weight \(w_k\), of every study is one, so that \(w_k\) drops out of the equation.

set.seed(2024)
Q_fixed <- replicate(10000, sum(rnorm(40)^2))
Q_random <- replicate(10000, sum((rnorm(40) + rnorm(40))^2))
hist(error_fixed, 
     xlab = expression(hat(theta[k]) ~ - ~ hat(theta)), prob = TRUE, 
     breaks = 100, ylim = c(0, .45), xlim = c(-4, 4), 
     main = "No Heterogeneity")
lines(seq(-4, 4, 0.01), dnorm(seq(-4, 4, 0.01)), 
      col = "blue", lwd = 2)

hist(error_random, 
     xlab = expression(hat(theta[k]) ~ - ~ hat(theta)), prob = TRUE, 
     breaks = 100, ylim = c(0, .45), xlim = c(-4, 4), 
     main = "Heterogeneity")
lines(seq(-4, 4, 0.01), dnorm(seq(-4, 4, 0.01)), 
      col = "blue", lwd = 2)

df <- 40 - 1
hist(Q_fixed, xlab = expression(italic("Q")), prob = TRUE, 
     breaks = 100, ylim = c(0, .06), xlim = c(0, 160), 
     main = "No Heterogeneity")
lines(seq(0, 100, 0.01), dchisq(seq(0, 100, 0.01), df = df), 
      col = "blue", lwd = 2)

hist(Q_random,  xlab = expression(italic("Q")), prob = TRUE, 
     breaks = 100, ylim = c(0, .06), xlim = c(0, 160), 
     main = "Heterogeneity")
lines(seq(0, 150, 0.01), dchisq(seq(0, 150, 0.01), df = df * 2), 
      col = "blue", lwd = 2)

2.2 Higgins & Thompson’s I^2 Statistic

  • Defined as the percentage of variability in the effect sizes that is not caused by sampling error.
  • It quantifies, in percent, how much the observed value of \(Q\) exceeds the expected \(Q\) value when there is no heterogeneity (i.e. K - 1) \[ I^2 = \frac{Max\{Q - (K - 1), \; 0\}}{Q} \ge 0 \]
hist((Q_fixed - 39) / Q_fixed, breaks = 100)

hist((Q_random - 39) / Q_random, breaks = 100)
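As a quick sanity check (a sketch using the \(Q\) value reported by the m.gen summary in section 2.6), \(I^2\) can also be computed by hand:

```r
# Hand computation of I^2 from the Q test reported in section 2.6
Q <- 45.50   # Cochran's Q for the ThirdWave meta-analysis
K <- 18      # number of studies
I2 <- max(Q - (K - 1), 0) / Q
round(100 * I2, 1)  # 62.6, matching the metagen() output
```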

2.3 The H^2 Statistic

  • A little more elegant than \(I^2\), because we do not have to artificially correct its value when \(Q\) is smaller than \(K - 1\).
  • Values greater than one indicate the presence of between-study heterogeneity.
  • Compared to \(I^2\), it is far less common to find this statistic reported in published meta-analyses. \[ H^2 = \frac{Q}{K-1} \]
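Using the same reported values (a sketch; \(Q\) and \(K\) are taken from the section 2.6 output), \(H^2\) and the \(H\) value printed by meta can be recovered:

```r
# Hand computation of H^2 from the Q test reported in section 2.6
Q <- 45.50; K <- 18
H2 <- Q / (K - 1)   # 2.68 > 1: between-study heterogeneity present
round(sqrt(H2), 2)  # 1.64, the H value reported by metagen()
```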

2.4 Q-Profile

  • The Q-Profile method is based on an altered \(Q\) version, the generalized Q-statistic \(Q_{gen}\).
  • While the standard version of \(Q\) uses the pooled effect based on the fixed-effect model, \(Q_{gen}\) is based on the random-effects model.
  • \(\hat{u}\): the overall effect according to the random-effects model
  • The Q-Profile method can be specified in meta functions through the argument method.tau.ci = "QP". This is the default setting.
  • \(Q_{gen}(\tilde{\tau}^2)\) is calculated repeatedly for increasing values of \(\tilde{\tau}^2\); the values at which it matches the upper and lower quantiles of the \(\chi^2\) distribution with \(K-1\) degrees of freedom form the lower and upper bound of the confidence interval for \(\tau^2\).

\[ Q_{gen} = \sum^{K}_{k=1} \; w_k^* \; (\hat{\theta}_k - \hat{u})^2 \\ w_k^* = \frac{1}{s_k^2 \; + \; \tau^2} \]
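A minimal sketch of the Q-Profile idea (not metafor's exact implementation; it assumes the m.gen object fitted in section 2.6 below is available):

```r
# Q_gen as a function of a candidate tau^2
Q.gen <- function(tau2, TE, seTE) {
  w  <- 1 / (seTE^2 + tau2)
  mu <- sum(w * TE) / sum(w)   # random-effects pooled estimate
  sum(w * (TE - mu)^2)
}
K <- length(m.gen$TE)
# tau^2 CI bounds: where Q_gen crosses the chi-square quantiles (K - 1 df);
# Q_gen decreases in tau^2, so the 97.5% quantile gives the lower bound
lo <- uniroot(function(t2) Q.gen(t2, m.gen$TE, m.gen$seTE) -
                qchisq(0.975, K - 1), c(0, 5))$root
hi <- uniroot(function(t2) Q.gen(t2, m.gen$TE, m.gen$seTE) -
                qchisq(0.025, K - 1), c(0, 5))$root
c(lo, hi)  # approx. the tau^2 CI [0.0295; 0.3533] reported in section 2.6
```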

2.5 Prediction Intervals (PIs)

  • A good way to overcome a limitation of the other measures, which can become significant simply because the included studies have large sample sizes.
  • Give us a range into which we can expect the effects of future studies to fall based on present evidence.
  • To calculate prediction intervals around the overall effect \(\hat{u}\), we use both the estimated between-study heterogeneity variance \(\hat{\tau}^2\), as well as the standard error of the pooled effect, \(SE_{\hat{u}}\).
  • When running a meta-analysis, we have to add the argument prediction = TRUE so that prediction intervals appear in the output. \[ \hat{u} \; \pm \; t_{K-1, 0.975} \; \sqrt{SE^2_{\hat{u}} \; + \; \hat{\tau}^2} \\ \hat{u} \; \pm \; t_{K-1, 0.975} \; SD_{PI} \]
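A hand check of the formula, using rounded values from the section 2.6 output below (a sketch; note that meta's output reports "df = 16" for the prediction interval, i.e. \(K-2\) degrees of freedom rather than the \(K-1\) in the formula above):

```r
# Approximate hand check of the prediction interval in section 2.6
mu   <- 0.5771  # pooled effect
se   <- 0.0943  # SE of the pooled effect, derived from its 95% CI
tau2 <- 0.0820  # estimated between-study variance
K    <- 18
mu + c(-1, 1) * qt(0.975, df = K - 2) * sqrt(se^2 + tau2)
# close to the reported [-0.0572; 1.2115]; deviations are rounding error
```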

2.6 Assessing Heterogeneity in R

data(ThirdWave)
m.gen <- metagen(TE = TE,
                 seTE = seTE,
                 studlab = Author,
                 data = ThirdWave,
                 sm = "SMD",
                 fixed = FALSE,
                 random = TRUE,
                 method.tau = "REML",
                 method.random.ci = "HK",
                 title = "Third Wave Psychotherapies")
m.gen <- update(m.gen, prediction = TRUE)
summary(m.gen)
## Review:     Third Wave Psychotherapies
## 
##                           SMD            95%-CI %W(random)
## Call et al.            0.7091 [ 0.1979; 1.2203]        5.0
## Cavanagh et al.        0.3549 [-0.0300; 0.7397]        6.3
## DanitzOrsillo          1.7912 [ 1.1139; 2.4685]        3.8
## de Vibe et al.         0.1825 [-0.0484; 0.4133]        7.9
## Frazier et al.         0.4219 [ 0.1380; 0.7057]        7.3
## Frogeli et al.         0.6300 [ 0.2458; 1.0142]        6.3
## Gallego et al.         0.7249 [ 0.2846; 1.1652]        5.7
## Hazlett-Stevens & Oren 0.5287 [ 0.1162; 0.9412]        6.0
## Hintz et al.           0.2840 [-0.0453; 0.6133]        6.9
## Kang et al.            1.2751 [ 0.6142; 1.9360]        3.9
## Kuhlmann et al.        0.1036 [-0.2781; 0.4853]        6.3
## Lever Taylor et al.    0.3884 [-0.0639; 0.8407]        5.6
## Phang et al.           0.5407 [ 0.0619; 1.0196]        5.3
## Rasanen et al.         0.4262 [-0.0794; 0.9317]        5.1
## Ratanasiripong         0.5154 [-0.1731; 1.2039]        3.7
## Shapiro et al.         1.4797 [ 0.8618; 2.0977]        4.2
## Song & Lindquist       0.6126 [ 0.1683; 1.0569]        5.7
## Warnecke et al.        0.6000 [ 0.1120; 1.0880]        5.2
## 
## Number of studies: k = 18
## 
##                              SMD            95%-CI    t  p-value
## Random effects model (HK) 0.5771 [ 0.3782; 0.7760] 6.12 < 0.0001
## Prediction interval              [-0.0572; 1.2115]              
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0820 [0.0295; 0.3533]; tau = 0.2863 [0.1717; 0.5944]
##  I^2 = 62.6% [37.9%; 77.5%]; H = 1.64 [1.27; 2.11]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  45.50   17  0.0002
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model (df = 17)
## - Prediction interval based on t-distribution (df = 16)

2.7 Outliers and Influential Cases

  • Assessing the robustness of our pooled results: outlier and influence analyses.

2.7.1 Basic Outlier Removal

  • View a study as an outlier if its confidence interval does not overlap with the confidence interval of the pooled effect.
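The rule can be sketched by hand with the fields of the metagen object (a sketch of the same idea that dmetar::find.outliers implements):

```r
# Flag studies whose 95% CI does not overlap the pooled effect's 95% CI
no.overlap <- m.gen$upper < m.gen$lower.random |
              m.gen$lower > m.gen$upper.random
m.gen$studlab[no.overlap]
# should flag "DanitzOrsillo" and "Shapiro et al." (cf. output below)
```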
m.gen.rem <- dmetar::find.outliers(m.gen)
summary(m.gen.rem)
## Identified outliers (random-effects model) 
## ------------------------------------------ 
## "DanitzOrsillo", "Shapiro et al." 
##  
## Results with outliers removed 
## ----------------------------- 
## Review:     Third Wave Psychotherapies
## 
## Number of studies: k = 16
## 
##                              SMD           95%-CI    t  p-value
## Random effects model (HK) 0.4528 [0.3257; 0.5800] 7.59 < 0.0001
## Prediction interval              [0.1687; 0.7369]              
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0139 [0.0000; 0.1032]; tau = 0.1180 [0.0000; 0.3213]
##  I^2 = 24.8% [0.0%; 58.7%]; H = 1.15 [1.00; 1.56]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  19.95   15  0.1739
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model (df = 15)
## - Prediction interval based on t-distribution (df = 14)
summary(m.gen.rem$m.random)
## Review:     Third Wave Psychotherapies
## 
##                           SMD            95%-CI %W(random) exclude
## Call et al.            0.7091 [ 0.1979; 1.2203]        4.4        
## Cavanagh et al.        0.3549 [-0.0300; 0.7397]        6.9        
## DanitzOrsillo          1.7912 [ 1.1139; 2.4685]        0.0       *
## de Vibe et al.         0.1825 [-0.0484; 0.4133]       13.1        
## Frazier et al.         0.4219 [ 0.1380; 0.7057]       10.4        
## Frogeli et al.         0.6300 [ 0.2458; 1.0142]        6.9        
## Gallego et al.         0.7249 [ 0.2846; 1.1652]        5.6        
## Hazlett-Stevens & Oren 0.5287 [ 0.1162; 0.9412]        6.2        
## Hintz et al.           0.2840 [-0.0453; 0.6133]        8.6        
## Kang et al.            1.2751 [ 0.6142; 1.9360]        2.8        
## Kuhlmann et al.        0.1036 [-0.2781; 0.4853]        7.0        
## Lever Taylor et al.    0.3884 [-0.0639; 0.8407]        5.4        
## Phang et al.           0.5407 [ 0.0619; 1.0196]        4.9        
## Rasanen et al.         0.4262 [-0.0794; 0.9317]        4.5        
## Ratanasiripong         0.5154 [-0.1731; 1.2039]        2.6        
## Shapiro et al.         1.4797 [ 0.8618; 2.0977]        0.0       *
## Song & Lindquist       0.6126 [ 0.1683; 1.0569]        5.6        
## Warnecke et al.        0.6000 [ 0.1120; 1.0880]        4.8        
## 
## Number of studies: k = 16
## 
##                              SMD           95%-CI    t  p-value
## Random effects model (HK) 0.4528 [0.3257; 0.5800] 7.59 < 0.0001
## Prediction interval              [0.1687; 0.7369]              
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0139 [0.0000; 0.1032]; tau = 0.1180 [0.0000; 0.3213]
##  I^2 = 24.8% [0.0%; 58.7%]; H = 1.15 [1.00; 1.56]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  19.95   15  0.1739
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model (df = 15)
## - Prediction interval based on t-distribution (df = 14)

2.7.2 Influence Analysis

  • Sometimes we find an overall effect in our meta-analysis whose significance depends on a single large study: the pooled effect would no longer be statistically significant once the influential study is removed.
  • Based on the leave-one-out (LOO) method, we recalculate the results of our meta-analysis K times, each time leaving out one study.
  • The InfluenceAnalysis function creates four influence diagnostic plots: a Baujat plot, influence diagnostics according to Viechtbauer and Cheung (2010), and the leave-one-out meta-analysis results, sorted by effect size and I^2 value.
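The LOO recalculation itself is also available directly in meta via metainf() (shown here as an alternative sketch; InfluenceAnalysis wraps this and more):

```r
# Leave-one-out meta-analyses with meta's built-in function
m.loo <- metainf(m.gen, pooled = "random")
m.loo  # pooled SMD with each study omitted in turn
```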
m.gen.inf <- InfluenceAnalysis(m.gen, random = TRUE)
## [===========================================================================] DONE
summary(m.gen.inf)
## Leave-One-Out Analysis (Sorted by I2) 
##  ----------------------------------- 
##                                 Effect  LLCI  ULCI    I2
## Omitting DanitzOrsillo           0.507 0.349 0.666 0.481
## Omitting Shapiro et al.          0.521 0.344 0.699 0.546
## Omitting de Vibe et al.          0.608 0.404 0.811 0.576
## Omitting Kang et al.             0.542 0.349 0.735 0.598
## Omitting Kuhlmann et al.         0.606 0.405 0.806 0.614
## Omitting Hintz et al.            0.601 0.391 0.811 0.636
## Omitting Gallego et al.          0.572 0.359 0.784 0.638
## Omitting Call et al.             0.574 0.362 0.786 0.642
## Omitting Frogeli et al.          0.579 0.364 0.793 0.644
## Omitting Cavanagh et al.         0.596 0.384 0.808 0.645
## Omitting Song & Lindquist        0.580 0.366 0.793 0.646
## Omitting Frazier et al.          0.595 0.380 0.809 0.647
## Omitting Lever Taylor et al.     0.593 0.381 0.805 0.647
## Omitting Warnecke et al.         0.580 0.367 0.794 0.647
## Omitting Hazlett-Stevens & Oren  0.585 0.371 0.800 0.648
## Omitting Phang et al.            0.584 0.371 0.797 0.648
## Omitting Rasanen et al.          0.590 0.378 0.802 0.648
## Omitting Ratanasiripong          0.583 0.372 0.794 0.648
## 
## 
## Influence Diagnostics 
##  ------------------- 
##                                 rstudent dffits cook.d cov.r QE.del   hat
## Omitting Call et al.               0.332  0.040  0.002 1.119 44.706 0.050
## Omitting Cavanagh et al.          -0.647 -0.209  0.047 1.145 45.066 0.063
## Omitting DanitzOrsillo             3.211  0.914  0.643 0.655 30.819 0.038
## Omitting de Vibe et al.           -1.377 -0.368  0.124 1.019 37.742 0.079
## Omitting Frazier et al.           -0.489 -0.190  0.040 1.184 45.317 0.073
## Omitting Frogeli et al.            0.136 -0.018  0.000 1.165 44.882 0.063
## Omitting Gallego et al.            0.396  0.059  0.004 1.129 44.260 0.057
## Omitting Hazlett-Stevens & Oren   -0.148 -0.092  0.009 1.166 45.447 0.060
## Omitting Hintz et al.             -0.899 -0.269  0.076 1.120 44.006 0.069
## Omitting Kang et al.               1.699  0.421  0.162 0.906 39.829 0.039
## Omitting Kuhlmann et al.          -1.448 -0.338  0.107 1.007 41.500 0.063
## Omitting Lever Taylor et al.      -0.520 -0.172  0.032 1.142 45.336 0.056
## Omitting Phang et al.             -0.107 -0.076  0.006 1.149 45.439 0.053
## Omitting Rasanen et al.           -0.400 -0.139  0.021 1.137 45.456 0.051
## Omitting Ratanasiripong           -0.143 -0.065  0.004 1.103 45.493 0.037
## Omitting Shapiro et al.            2.460  0.718  0.416 0.754 35.207 0.042
## Omitting Song & Lindquist          0.084 -0.029  0.001 1.152 45.146 0.057
## Omitting Warnecke et al.           0.049 -0.036  0.001 1.143 45.263 0.052
##                                 weight infl
## Omitting Call et al.             5.036     
## Omitting Cavanagh et al.         6.267     
## Omitting DanitzOrsillo           3.751    *
## Omitting de Vibe et al.          7.880     
## Omitting Frazier et al.          7.337     
## Omitting Frogeli et al.          6.274     
## Omitting Gallego et al.          5.703     
## Omitting Hazlett-Stevens & Oren  5.982     
## Omitting Hintz et al.            6.854     
## Omitting Kang et al.             3.860     
## Omitting Kuhlmann et al.         6.300     
## Omitting Lever Taylor et al.     5.586     
## Omitting Phang et al.            5.332     
## Omitting Rasanen et al.          5.086     
## Omitting Ratanasiripong          3.678     
## Omitting Shapiro et al.          4.165     
## Omitting Song & Lindquist        5.664     
## Omitting Warnecke et al.         5.246     
## 
## 
## Baujat Diagnostics (sorted by Heterogeneity Contribution) 
##  ------------------------------------------------------- 
##                                 HetContrib InfluenceEffectSize
## Omitting DanitzOrsillo              14.385               0.298
## Omitting Shapiro et al.             10.044               0.251
## Omitting de Vibe et al.              6.403               1.357
## Omitting Kang et al.                 5.552               0.121
## Omitting Kuhlmann et al.             3.746               0.256
## Omitting Hintz et al.                1.368               0.129
## Omitting Gallego et al.              1.183               0.060
## Omitting Call et al.                 0.768               0.028
## Omitting Frogeli et al.              0.582               0.039
## Omitting Cavanagh et al.             0.409               0.027
## Omitting Song & Lindquist            0.339               0.017
## Omitting Warnecke et al.             0.230               0.009
## Omitting Frazier et al.              0.164               0.021
## Omitting Lever Taylor et al.         0.159               0.008
## Omitting Phang et al.                0.061               0.003
## Omitting Hazlett-Stevens & Oren      0.052               0.003
## Omitting Rasanen et al.              0.044               0.002
## Omitting Ratanasiripong              0.010               0.000

2.7.2.1 Baujat Plot

  • The plot shows the contribution of each study to the overall heterogeneity (as measured by Cochran’s \(Q\)) on the horizontal axis, and its influence on the pooled effect size on the vertical axis.
  • The influence value is determined through the LOO method, and expresses the standardized difference of the overall effect when the study is included in the meta-analysis, versus when it is not included.
  • Studies on the right side of the plot can be regarded as potentially relevant cases since they contribute heavily to the overall heterogeneity in our meta-analysis (already detected before: DanitzOrsillo and Shapiro et al.)
plot(m.gen.inf, "baujat")

2.7.2.2 Influence Diagnostics

plot(m.gen.inf, "influence")

2.7.2.3 LOO Meta-Analysis Results

plot(m.gen.inf, "es")  # effect size

plot(m.gen.inf, "i2")  # Higgins & Thompson's I^2

2.7.3 GOSH Plot Analysis

  • Fit the same meta-analysis model to all possible subsets of our included studies. In contrast to the leave-one-out method, we therefore fit not only \(K\) models, but a model for all \(2^{k-1}\) possible study combinations, which is computationally expensive.
  • The R implementation we cover here only fits a maximum of 1 million randomly selected models.
m.rma <- rma(yi = m.gen$TE,
             sei = m.gen$seTE,
             method = m.gen$method.tau,
             test = "knha")
res.gosh <- gosh(m.rma)
plot(res.gosh, alpha = 0.01)

  • The gosh.diagnostics function uses three clustering algorithms to detect patterns in our data: the k-means algorithm (Hartigan and Wong 1979); density reachability and connectivity clustering, or DBSCAN (Schubert et al. 2017); and Gaussian mixture models (Fraley and Raftery 2002).
res.gosh.diag <- gosh.diagnostics(res.gosh, 
                                  km.params = list(centers = 2), 
                                  db.params = list(eps = 0.08, 
                                                   MinPts = 50))
##   
##  Perform Clustering... 
##  |==========================================================================================| DONE
res.gosh.diag
## GOSH Diagnostics 
## ================================ 
## 
##  - Number of K-means clusters detected: 2
##  - Number of DBSCAN clusters detected: 4
##  - Number of GMM clusters detected: 7
## 
##  Identification of potential outliers 
##  --------------------------------- 
## 
##  - K-means: Study 3, Study 16
##  - DBSCAN: Study 3, Study 4, Study 16
##  - Gaussian Mixture Model: Study 3, Study 4, Study 11, Study 16
plot(res.gosh.diag)

update(m.gen, exclude = c(3, 4, 16)) %>% summary()
## Review:     Third Wave Psychotherapies
## 
##                           SMD            95%-CI %W(random) exclude
## Call et al.            0.7091 [ 0.1979; 1.2203]        4.6        
## Cavanagh et al.        0.3549 [-0.0300; 0.7397]        8.1        
## DanitzOrsillo          1.7912 [ 1.1139; 2.4685]        0.0       *
## de Vibe et al.         0.1825 [-0.0484; 0.4133]        0.0       *
## Frazier et al.         0.4219 [ 0.1380; 0.7057]       14.8        
## Frogeli et al.         0.6300 [ 0.2458; 1.0142]        8.1        
## Gallego et al.         0.7249 [ 0.2846; 1.1652]        6.2        
## Hazlett-Stevens & Oren 0.5287 [ 0.1162; 0.9412]        7.0        
## Hintz et al.           0.2840 [-0.0453; 0.6133]       11.0        
## Kang et al.            1.2751 [ 0.6142; 1.9360]        2.7        
## Kuhlmann et al.        0.1036 [-0.2781; 0.4853]        8.2        
## Lever Taylor et al.    0.3884 [-0.0639; 0.8407]        5.8        
## Phang et al.           0.5407 [ 0.0619; 1.0196]        5.2        
## Rasanen et al.         0.4262 [-0.0794; 0.9317]        4.7        
## Ratanasiripong         0.5154 [-0.1731; 1.2039]        2.5        
## Shapiro et al.         1.4797 [ 0.8618; 2.0977]        0.0       *
## Song & Lindquist       0.6126 [ 0.1683; 1.0569]        6.1        
## Warnecke et al.        0.6000 [ 0.1120; 1.0880]        5.0        
## 
## Number of studies: k = 15
## 
##                              SMD           95%-CI    t  p-value
## Random effects model (HK) 0.4819 [0.3595; 0.6043] 8.44 < 0.0001
## Prediction interval              [0.3614; 0.6024]              
## 
## Quantifying heterogeneity:
##  tau^2 < 0.0001 [0.0000; 0.0955]; tau = 0.0012 [0.0000; 0.3091]
##  I^2 = 4.6% [0.0%; 55.7%]; H = 1.02 [1.00; 1.50]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  14.67   14  0.4011
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model (df = 14)
## - Prediction interval based on t-distribution (df = 13)

Let us assume we determined influential studies in our meta-analysis. In this case, it makes sense to also report the results of a sensitivity analysis in which these studies are excluded. To make it easy for readers to see the changes associated with removing the influential studies, we can create a table in which both the original results, as well as the results of the sensitivity analysis are displayed.

3 Forest Plots

  • A diamond shape represents the average effect. The length of the diamond symbolizes the confidence interval of the pooled result on the x-axis.
  • When the summary measure is a ratio (such as odds ratios or risk ratios), it is common to use a logarithmic scale on the x-axis.
  • “RevMan5” mimics the layout of Review Manager 5 (RevMan), a software developed by Cochrane.
png(file = "fig/doing-meta-analysis_2-1.png", 
    width = 3600, height = 2400, res = 300)

meta::forest(m.gen, 
             sortvar = TE, 
             layout = "RevMan5",  # Cochrane's Review Manager 5
             xlim = c(floor(min(m.gen$lower)), ceiling(max(m.gen$upper))), 
             prediction = TRUE, 
             print.I2 = TRUE, 
             print.I2.ci = TRUE, 
             print.tau2 = TRUE, 
             print.tau2.ci = TRUE, 
             # smlab = m.gen$sm, 
             label.left = NULL, 
             col.square = col.tw,             # col.* objects: user-defined colors
             col.square.line = col.bmc.sky, 
             col.diamond = col.bmc.pink,
             # col.diamond.common = col.bmc.pink, 
             col.diamond.random = col.tw, 
             col.diamond.lines = col.bmc.pink, 
             col.predict = col.sl, 
             col.predict.lines = col.bmc.pink, 
             # leftcols = c("studlab", "TE", "seTE", "RiskOfBias"),
             # leftlabs = c("Author", "g", "SE", "Risk of Bias"),
             fontfamily = "Times New Roman", 
             colgap = "15mm")
dev.off()
## png 
##   2

4 Drapery Plots

  • Drapery plots are based on p-value functions. Such p-value functions have been proposed to prevent us from solely relying on the p<0.05 significance threshold when interpreting the results of an analysis.
  • Therefore, instead of only calculating the 95% confidence interval, p-value functions provide a continuous curve which shows the confidence interval for varying values of p.
drapery(m.gen,
        labels = "studlab",
        type = "pval",
        legend = FALSE)

5 Subgroup Analyses

5.1 The Fixed-Effects Plural Model (Mixed-Effects Model)

  • A meta-regression (mixed-effects model) with a categorical predictor. \[ D_g = 0 \; (\mathrm{Subgroup \; A}) \\ D_g = 1 \; (\mathrm{Subgroup \; B}) \\ \hat{\theta_k} = \theta + \beta \; D_g + \epsilon_k + \zeta_k \] \(\theta\): the intercept in our regression model, i.e. the true overall effect size of subgroup A.
    \(\beta\): the effect size difference between subgroup A and subgroup B.

  • An objective is to reject the null hypothesis that there is no difference in effect sizes between subgroups (e.g. \(Q\) test), regarding the pooled effect of a subgroup as the observed effect size of one large study.

  • The difference to a normal meta-analysis is that we conduct several separate random-effects meta-analyses, one for each subgroup. Studies within a subgroup are drawn from a universe of populations, the mean of which we want to estimate.

  • Since each subgroup gets its own separate meta-analysis, estimated heterogeneity \(\hat{\tau_g}^2\) will also differ from subgroup to subgroup.

  • When the number of studies in a subgroup is small (< 5), it is better to calculate a pooled version of \(\tau^2\) that is used across all subgroups, than to rely on a very imprecise estimate of the between-study heterogeneity in one subgroup \(\hat{\tau_g}^2\).

  • At least \(K = 10\) studies are generally required before conducting subgroup analyses.

The Fixed-Effects Plural Model
update(m.gen, 
       subgroup = RiskOfBias, 
       tau.common = FALSE)
## Review:     Third Wave Psychotherapies
## 
## Number of studies: k = 18
## 
##                              SMD            95%-CI    t  p-value
## Random effects model (HK) 0.5771 [ 0.3782; 0.7760] 6.12 < 0.0001
## Prediction interval              [-0.0572; 1.2115]              
## 
## Quantifying heterogeneity:
##  tau^2 = 0.0820 [0.0295; 0.3533]; tau = 0.2863 [0.1717; 0.5944]
##  I^2 = 62.6% [37.9%; 77.5%]; H = 1.64 [1.27; 2.11]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  45.50   17  0.0002
## 
## Results for subgroups (random effects model (HK)):
##                     k    SMD           95%-CI  tau^2    tau     Q   I^2
## RiskOfBias = high   7 0.8126 [0.2835; 1.3417] 0.2423 0.4922 25.89 76.8%
## RiskOfBias = low   11 0.4300 [0.2770; 0.5830] 0.0099 0.0997 13.42 25.5%
## 
## Test for subgroup differences (random effects model (HK)):
##                   Q d.f. p-value
## Between groups 2.84    1  0.0917
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Hartung-Knapp adjustment for random effects model (df = 17)
## - Prediction interval based on t-distribution (df = 16)
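Following the advice above for small subgroups, a pooled \(\tau^2\) across subgroups can be requested via the tau.common argument (a sketch on the same data):

```r
# Assume a common tau^2 across subgroups (useful when subgroups are small)
update(m.gen,
       subgroup = RiskOfBias,
       tau.common = TRUE)
```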

6 Meta-Regression (Mixed-Effects Model)

\[ \hat{\theta}_k = \theta + \beta \; x_k + \epsilon_k + \zeta_k \]

  • Perform a regression with predictors on a study level.
  • Subgroup analysis is a meta-regression with a categorical predictor.
  • The random-effects model to pool effect sizes is nothing but a meta-regression model without a slope term, \(\beta\).
  • In meta-regression, a modified method called weighted least squares (WLS) is used, which makes sure that studies with a smaller standard error are given a higher weight.
  • \(R^2\) in meta-regression is slightly different to the one used in conventional regressions. \(R^2_*\) uses the amount of residual heterogeneity variance that even the meta-regression slope cannot explain, and puts it in relation to the total heterogeneity that we initially found in our meta-analysis.

\[ R^2_* = 1 - \frac{\hat{\tau}^2_{unexplained}}{\hat{\tau}^2_{(total)}} \]

Meta-regression with a continuous predictor and four studies
# Publication years of the 18 ThirdWave studies, used as a predictor.
year <- c(2014, 1998, 2010, 1999, 2005, 2014, 
          2019, 2010, 1982, 2020, 1978, 2001, 
          2018, 2002, 2009, 2011, 2011, 2013)
m.gen.reg <- metareg(m.gen, ~ year)
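The \(R^2_*\) statistic can then be read from the fitted object, or computed by hand from the two \(\tau^2\) estimates (a sketch; metareg returns a metafor rma object, whose R2 component is given in percent):

```r
# R^2*: proportion of between-study heterogeneity explained by 'year'
m.gen.reg$R2                                # reported by metafor, in %
(m.gen$tau2 - m.gen.reg$tau2) / m.gen$tau2  # hand computation
```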

7 Power Analysis

power.analysis(
    d = 0.2,
    k = 10,
    n1 = 25,
    n2 = 25,
    p = 0.05
)
## Fixed-effect model used.

## Power: 60.66%
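The fixed-effect result can be verified by hand with the standard power formula for a fixed-effect meta-analysis of SMDs (a sketch; it reproduces the value above exactly):

```r
# Hand check of the fixed-effect power estimate above
d <- 0.2; k <- 10; n1 <- 25; n2 <- 25
v <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))  # variance of one SMD
lambda <- d / sqrt(v / k)                           # test statistic under H1
power <- 1 - pnorm(qnorm(0.975) - lambda) +
  pnorm(-qnorm(0.975) - lambda)
round(100 * power, 2)  # 60.66
```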
power.analysis(
    d = 0.5,
    k = 10,
    n1 = 25,
    n2 = 25,
    p = 0.05,
    heterogeneity = "moderate"
)
## Random-effects model used (moderate heterogeneity assumed).

## Power: 98.93%

8 References